Measuring Word Relatedness Using Heterogeneous Vector Space Models
نویسندگان
چکیده
Noticing that different information sources often provide complementary coverage of word sense and meaning, we propose a simple and yet effective strategy for measuring lexical semantics. Our model consists of a committee of vector space models built on a text corpus, Web search results and thesauruses, and measures the semantic word relatedness using the averaged cosine similarity scores. Despite its simplicity, our system correlates with human judgements better or similarly compared to existing methods on several benchmark datasets, including WordSim353.
منابع مشابه
Alternative measures of word relatedness in distributional semantics
This paper presents an alternative method to measuring word-word semantic relatedness in distributional semantics framework. The main idea is to represent target words as rankings of all co-occurring words in a text corpus, ordered by their tf – idf weight and use a metric between rankings (such as Jaro distance or Rank distance) to compute semantic relatedness. This method has several advantag...
متن کاملMeasuring author research relatedness: A comparison of word-based, topic-based, and author cocitation approaches
Relationships between authors based on characteristics of published literature have been studied for decades. Author cocitation analysis using mapping techniques has been most frequently used to study how closely two authors are thought to be in intellectual space based on how members of the research community co-cite their works. Other approaches exist to study author relatedness based more di...
متن کاملCombining Word Representations for Measuring Word Relatedness and Similarity
Many unsupervised methods, such as Latent Semantic Analysis and Latent Dirichlet Allocation, have been proposed to automatically infer word representations in the form of a vector. By representing a word by a vector, one can exploit the power of vector algebra to solve many Natural Language Processing tasks e.g. by computing the cosine similarity between the corresponding word vectors the seman...
متن کاملMeasuring semantic relatedness with vector space models and random walks
Both vector space models and graph randomwalk models can be used to determine similarity between concepts. Noting that vectors can be regarded as local views of a graph, we directly compare vector space models and graph random walk models on standard tasks of predicting human similarity ratings, concept categorization, and semantic priming, varying the size of the dataset from which vector spac...
متن کاملRepresentation Learning for Measuring Entity Relatedness with Rich Information
Incorporating multiple types of relational information from heterogeneous networks has been proved effective in data mining. Although Wikipedia is one of the most famous heterogeneous network, previous works of semantic analysis on Wikipedia are mostly limited on single type of relations. In this paper, we aim at incorporating multiple types of relations to measure the semantic relatedness betw...
متن کامل